Improving Norwegian Translation of Bicycle Terminology Using Custom Named-Entity Recognition and Neural Machine Translation

Abstract

The Norwegian business-to-business (B2B) market for bicycles consists mainly of international brands, such as Shimano, Trek, Cannondale, and Specialized. The product descriptions for these brands are usually in English and need local translation. However, these product descriptions include bicycle-specific terminologies that are challenging for online translators, such as Google Translate. For this reason, local companies outsource translation or translate product descriptions manually, which is cumbersome. In light of the Norwegian B2B bicycle industry, this paper explores transfer learning to improve the machine translation of bicycle-specific terminology from English to Norwegian, including generic text. Firstly, we trained a custom Named-Entity Recognition (NER) model to identify cycling-specific terminology and then adapted a MarianMT neural machine translation model for the translation process. Due to the lack of publicly available bicycle-terminology-related datasets to train the proposed models, we created our dataset by collecting a corpus of cycling-related texts. We evaluated the performance of our proposed model and compared its performance with that of Google Translate. Our model outperformed Google Translate on the test set, with a SacreBleu score of 45.099 against 36.615 for Google Translate on average. We also created a web application where the user can input English text with related bicycle terminologies, and it will return the detected cycling-specific words in addition to the translation.

Similar resources

Improving Machine Translation Quality with Automatic Named Entity Recognition

Named entities create serious problems for state-of-the-art commercial machine translation (MT) systems and often cause translation failures beyond the local context, affecting both the overall morphosyntactic well-formedness of sentences and word sense disambiguation in the source text. We report on the results of an experiment in which MT input was processed using output from the named entity...

Selecting Translation Strategies in MT using Automatic Named Entity Recognition

We report on the results of an experiment aimed at enabling a machine translation system to select the appropriate strategy for dealing with words and phrases which have different translations depending on whether they are used as proper names or common nouns in the source text. We used the ANNIE named entity recognition system to identify named entities in the source text and pass them to MT s...

Using Wikipedia for Named-Entity Translation

In this paper we present a system for translating named-entities from Basque to English using Wikipedia’s knowledge. We can exploit interlingual links from Wikipedia (WIL) to get named-entity translation, but entities without interlingual links can be translated using the Wikipedia as a corpus, suggesting new interlingual links. In this second case the interlingual links can be used as a test c...

Named entity translation using anchor texts

This work describes a process to extract Named Entity (NE) translations from the text available in web links (anchor texts). It translates an NE by retrieving a list of web documents in the target language, extracting the anchor texts from the links to those documents and finding the best translation from the anchor texts, using a combination of features, some of which are specific to anchor te...

Improved Named Entity Recognition using Machine Translation-based Cross-lingual Information

In this paper, we describe a technique to improve named entity recognition in a resource-poor language (Hindi) by using cross-lingual information. We use an on-line machine translation system and a separate word alignment phase to find the projection of each Hindi word into the translated English sentence. We estimate the cross-lingual features using an English named entity recognizer and the a...

Journal

Journal title: Electronics

Year: 2023

ISSN: 2079-9292

DOI: https://doi.org/10.3390/electronics12102334